{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Cattle Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the geostates package" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`geostates` can be used to create choropleth plots of the United States or individual states. It is easy to use,\n", "so we will start with an example that shows some of the ins and outs of the package." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cattle analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Goal:** To illustrate the power of the package, we will create a plot that shows how the number of cattle varies by state in the United States." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will start by importing the `pandas` package. The `geostates` functions are imported later, when we need them." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading in the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this example, we use data on US cattle from the [United States Department of Agriculture National Agricultural\n", "Statistics Service](https://quickstats.nass.usda.gov). The CSV includes the total number of cattle (including calves) in the United States as of January 2022, broken down by state." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# read in the data\n", "cattle_data = pd.read_csv('Desktop/cattle_data_22.csv', index_col='State', thousands=',')\n", "cattle_data.index = cattle_data.index.str.title()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
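<table border='1' class='dataframe'><thead>
<tr><th></th><th>Program</th><th>Year</th><th>Period</th><th>Week Ending</th><th>Geo Level</th><th>State ANSI</th><th>Ag District</th><th>Ag District Code</th><th>County</th><th>County ANSI</th><th>Zip Code</th><th>Region</th><th>watershed_code</th><th>Watershed</th><th>Commodity</th><th>Data Item</th><th>Domain</th><th>Domain Category</th><th>Value</th><th>CV (%)</th></tr>
<tr><th>State</th><th colspan='20'></th></tr>
</thead><tbody>
<tr><th>Alabama</th><td>SURVEY</td><td>2022</td><td>FIRST OF JAN</td><td>NaN</td><td>STATE</td><td>1</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>NaN</td><td>CATTLE</td><td>CATTLE, INCL CALVES - INVENTORY</td><td>TOTAL</td><td>NOT SPECIFIED</td><td>1260000</td><td>NaN</td></tr>
<tr><th>Alaska</th><td>SURVEY</td><td>2022</td><td>FIRST OF JAN</td><td>NaN</td><td>STATE</td><td>2</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>NaN</td><td>CATTLE</td><td>CATTLE, INCL CALVES - INVENTORY</td><td>TOTAL</td><td>NOT SPECIFIED</td><td>18000</td><td>NaN</td></tr>
<tr><th>Arizona</th><td>SURVEY</td><td>2022</td><td>FIRST OF JAN</td><td>NaN</td><td>STATE</td><td>4</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>NaN</td><td>CATTLE</td><td>CATTLE, INCL CALVES - INVENTORY</td><td>TOTAL</td><td>NOT SPECIFIED</td><td>960000</td><td>NaN</td></tr>
<tr><th>Arkansas</th><td>SURVEY</td><td>2022</td><td>FIRST OF JAN</td><td>NaN</td><td>STATE</td><td>5</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>NaN</td><td>CATTLE</td><td>CATTLE, INCL CALVES - INVENTORY</td><td>TOTAL</td><td>NOT SPECIFIED</td><td>1690000</td><td>NaN</td></tr>
<tr><th>California</th><td>SURVEY</td><td>2022</td><td>FIRST OF JAN</td><td>NaN</td><td>STATE</td><td>6</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0</td><td>NaN</td><td>CATTLE</td><td>CATTLE, INCL CALVES - INVENTORY</td><td>TOTAL</td><td>NOT SPECIFIED</td><td>5200000</td><td>NaN</td></tr>
</tbody></table>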
\n", "
" ], "text/plain": [ " Program Year Period Week Ending Geo Level State ANSI \\\n", "State \n", "Alabama SURVEY 2022 FIRST OF JAN NaN STATE 1 \n", "Alaska SURVEY 2022 FIRST OF JAN NaN STATE 2 \n", "Arizona SURVEY 2022 FIRST OF JAN NaN STATE 4 \n", "Arkansas SURVEY 2022 FIRST OF JAN NaN STATE 5 \n", "California SURVEY 2022 FIRST OF JAN NaN STATE 6 \n", "\n", " Ag District Ag District Code County County ANSI Zip Code \\\n", "State \n", "Alabama NaN NaN NaN NaN NaN \n", "Alaska NaN NaN NaN NaN NaN \n", "Arizona NaN NaN NaN NaN NaN \n", "Arkansas NaN NaN NaN NaN NaN \n", "California NaN NaN NaN NaN NaN \n", "\n", " Region watershed_code Watershed Commodity \\\n", "State \n", "Alabama NaN 0 NaN CATTLE \n", "Alaska NaN 0 NaN CATTLE \n", "Arizona NaN 0 NaN CATTLE \n", "Arkansas NaN 0 NaN CATTLE \n", "California NaN 0 NaN CATTLE \n", "\n", " Data Item Domain Domain Category Value \\\n", "State \n", "Alabama CATTLE, INCL CALVES - INVENTORY TOTAL NOT SPECIFIED 1260000 \n", "Alaska CATTLE, INCL CALVES - INVENTORY TOTAL NOT SPECIFIED 18000 \n", "Arizona CATTLE, INCL CALVES - INVENTORY TOTAL NOT SPECIFIED 960000 \n", "Arkansas CATTLE, INCL CALVES - INVENTORY TOTAL NOT SPECIFIED 1690000 \n", "California CATTLE, INCL CALVES - INVENTORY TOTAL NOT SPECIFIED 5200000 \n", "\n", " CV (%) \n", "State \n", "Alabama NaN \n", "Alaska NaN \n", "Arizona NaN \n", "Arkansas NaN \n", "California NaN " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# take a look at what the CSV file looks like\n", "cattle_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Cleaning the data**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It looks like our CSV file has a few extra columns including Program, Commodity, Domain, etc. that we do not need. It also shows a few columns that have missing (NaN) values. Let's start out by removing all of the unnecessary columns and removing all of the NaNs. 
Let's also rename the 'Value' column to 'Cattle' to make it clearer. Because we passed `thousands=','` to `read_csv`, the values should already be parsed as integers rather than strings, which we can confirm by checking the column's `dtype`." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
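<table border='1' class='dataframe'><thead>
<tr><th></th><th>Cattle</th></tr>
<tr><th>State</th><th></th></tr>
</thead><tbody>
<tr><th>Alabama</th><td>1260000</td></tr>
<tr><th>Alaska</th><td>18000</td></tr>
<tr><th>Arizona</th><td>960000</td></tr>
<tr><th>Arkansas</th><td>1690000</td></tr>
<tr><th>California</th><td>5200000</td></tr>
</tbody></table>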
\n", "
" ], "text/plain": [ " Cattle\n", "State \n", "Alabama 1260000\n", "Alaska 18000\n", "Arizona 960000\n", "Arkansas 1690000\n", "California 5200000" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# drop every column that contains NaN values, then the remaining unnecessary columns\n", "cattle_data = cattle_data.dropna(axis='columns')\n", "cattle_data = cattle_data.drop(columns=['Program', 'Year', 'Period', 'Geo Level', 'State ANSI', 'watershed_code', 'Commodity',\n", " 'Data Item', 'Domain', 'Domain Category'])\n", "\n", "# rename the column from 'Value' to 'Cattle'\n", "cattle_data = cattle_data.rename(columns={'Value': 'Cattle'})\n", "\n", "# view the first five values\n", "cattle_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have the total number of cattle for each state, we could visualize this by creating a choropleth map\n", "that shows the variation in total cattle inventory by state. While this is interesting, it might not fully capture\n", "the variation we are looking for. For example, bigger states like California and Texas are likely to have the largest total\n", "number of cattle. One interesting metric for comparing cattle numbers across states\n", "is the cattle-to-person ratio. This allows us to examine a state's total inventory of cattle relative to\n", "its population." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this, we will use population data from the [United States Census Bureau's Population and Housing Unit Estimates](https://www.census.gov/programs-surveys/popest/data/data-sets.html)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# read in the data\n", "population_data = pd.read_csv('Desktop/state_population_21.csv', index_col='State', thousands=',')" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
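<table border='1' class='dataframe'><thead>
<tr><th></th><th>Population</th></tr>
<tr><th>State</th><th></th></tr>
</thead><tbody>
<tr><th>Oklahoma</th><td>3986639</td></tr>
<tr><th>Nebraska</th><td>1963692</td></tr>
<tr><th>Hawaii</th><td>1441553</td></tr>
<tr><th>South Dakota</th><td>895376</td></tr>
<tr><th>Tennessee</th><td>6975218</td></tr>
</tbody></table>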
\n", "
" ], "text/plain": [ " Population\n", "State \n", "Oklahoma 3986639\n", "Nebraska 1963692\n", "Hawaii 1441553\n", "South Dakota 895376\n", "Tennessee 6975218" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "population_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's merge these two datasets together." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
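<table border='1' class='dataframe'><thead>
<tr><th></th><th>Cattle</th><th>Population</th></tr>
<tr><th>State</th><th></th><th></th></tr>
</thead><tbody>
<tr><th>Alabama</th><td>1260000</td><td>5039877</td></tr>
<tr><th>Alaska</th><td>18000</td><td>732673</td></tr>
<tr><th>Arizona</th><td>960000</td><td>7276316</td></tr>
<tr><th>Arkansas</th><td>1690000</td><td>3025891</td></tr>
<tr><th>California</th><td>5200000</td><td>39237836</td></tr>
</tbody></table>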
\n", "
" ], "text/plain": [ " Cattle Population\n", "State \n", "Alabama 1260000 5039877\n", "Alaska 18000 732673\n", "Arizona 960000 7276316\n", "Arkansas 1690000 3025891\n", "California 5200000 39237836" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "merged_df = pd.merge(cattle_data, population_data, on='State')\n", "merged_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analyzing the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's compute the cattle to person ratio for each state and sort the list by descending values." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "State\n", "South Dakota 4.244027\n", "Nebraska 3.462865\n", "North Dakota 2.387257\n", "Kansas 2.214966\n", "Wyoming 2.159629\n", "Montana 1.992265\n", "Idaho 1.341454\n", "Oklahoma 1.304357\n", "Iowa 1.205733\n", "Missouri 0.654974\n", "New Mexico 0.614402\n", "Wisconsin 0.593632\n", "Arkansas 0.558513\n", "Colorado 0.455948\n", "Kentucky 0.447954\n", "dtype: float64" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# compute the cattle to person ratio by dividing the Cattle column by the Population column\n", "cattle_ratio = merged_df['Cattle']/merged_df['Population']\n", "\n", "# sort the values to see which states have the highest Cattle to Person ratio\n", "sorted_cattle_ratio = cattle_ratio.sort_values(ascending=False)\n", "\n", "# view the first 15 values of the sorted pandas series\n", "sorted_cattle_ratio.head(15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is interesting! In fact, it turns out there are **nine states** where there are more cattle than people!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's append this as a third column to our original dataframe and round the values to three decimal places." 
] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "# convert the series containing the ratio to a dataframe and merge it with the original dataframe\n", "final_df = merged_df.merge(cattle_ratio.to_frame('Ratio'), on='State')\n", "\n", "# round the values of the ratio column to three decimal places\n", "final_df['Ratio'] = final_df['Ratio'].round(3)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
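<table border='1' class='dataframe'><thead>
<tr><th></th><th>Cattle</th><th>Population</th><th>Ratio</th></tr>
<tr><th>State</th><th></th><th></th><th></th></tr>
</thead><tbody>
<tr><th>Alabama</th><td>1260000</td><td>5039877</td><td>0.250</td></tr>
<tr><th>Alaska</th><td>18000</td><td>732673</td><td>0.025</td></tr>
<tr><th>Arizona</th><td>960000</td><td>7276316</td><td>0.132</td></tr>
<tr><th>Arkansas</th><td>1690000</td><td>3025891</td><td>0.559</td></tr>
<tr><th>California</th><td>5200000</td><td>39237836</td><td>0.133</td></tr>
<tr><th>Colorado</th><td>2650000</td><td>5812069</td><td>0.456</td></tr>
<tr><th>Connecticut</th><td>47000</td><td>3605597</td><td>0.013</td></tr>
<tr><th>Delaware</th><td>12000</td><td>1003384</td><td>0.012</td></tr>
<tr><th>Florida</th><td>1630000</td><td>21781128</td><td>0.075</td></tr>
<tr><th>Georgia</th><td>1050000</td><td>10799566</td><td>0.097</td></tr>
</tbody></table>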
\n", "
" ], "text/plain": [ " Cattle Population Ratio\n", "State \n", "Alabama 1260000 5039877 0.250\n", "Alaska 18000 732673 0.025\n", "Arizona 960000 7276316 0.132\n", "Arkansas 1690000 3025891 0.559\n", "California 5200000 39237836 0.133\n", "Colorado 2650000 5812069 0.456\n", "Connecticut 47000 3605597 0.013\n", "Delaware 12000 1003384 0.012\n", "Florida 1630000 21781128 0.075\n", "Georgia 1050000 10799566 0.097" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "final_df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have a dataframe containing the ratio of cattle inventory to population, we are ready to use `geostates` to visualize it!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize the data using geostates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first step for using the `geostates` package is to load in the geodataframe containing all of the state values. For this, we will use the `load_states()` function and assign the result to a variable `df`. Once we've loaded in the geodataframe, we need to merge it with our cattle data." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# import the load_states() function from the geostates package\n", "from geostates.shapefiles import load_states" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
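<table border='1' class='dataframe'><thead>
<tr><th></th><th>STATEFP</th><th>STATENS</th><th>AFFGEOID</th><th>GEOID</th><th>NAME</th><th>LSAD</th><th>ALAND</th><th>AWATER</th><th>geometry</th></tr>
<tr><th>STUSPS</th><th colspan='9'></th></tr>
</thead><tbody>
<tr><th>MS</th><td>28</td><td>01779790</td><td>0400000US28</td><td>28</td><td>Mississippi</td><td>00</td><td>121533519481</td><td>3926919758</td><td>MULTIPOLYGON (((-88.50297 30.21523, -88.49176 ...</td></tr>
<tr><th>NC</th><td>37</td><td>01027616</td><td>0400000US37</td><td>37</td><td>North Carolina</td><td>00</td><td>125923656064</td><td>13466071395</td><td>MULTIPOLYGON (((-75.72681 35.93584, -75.71827 ...</td></tr>
<tr><th>OK</th><td>40</td><td>01102857</td><td>0400000US40</td><td>40</td><td>Oklahoma</td><td>00</td><td>177662925723</td><td>3374587997</td><td>POLYGON ((-103.00257 36.52659, -103.00219 36.6...</td></tr>
<tr><th>VA</th><td>51</td><td>01779803</td><td>0400000US51</td><td>51</td><td>Virginia</td><td>00</td><td>102257717110</td><td>8528531774</td><td>MULTIPOLYGON (((-75.74241 37.80835, -75.74151 ...</td></tr>
<tr><th>WV</th><td>54</td><td>01779805</td><td>0400000US54</td><td>54</td><td>West Virginia</td><td>00</td><td>62266474513</td><td>489028543</td><td>POLYGON ((-82.64320 38.16909, -82.64300 38.169...</td></tr>
</tbody></table>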
\n", "
" ], "text/plain": [ " STATEFP STATENS AFFGEOID GEOID NAME LSAD \\\n", "STUSPS \n", "MS 28 01779790 0400000US28 28 Mississippi 00 \n", "NC 37 01027616 0400000US37 37 North Carolina 00 \n", "OK 40 01102857 0400000US40 40 Oklahoma 00 \n", "VA 51 01779803 0400000US51 51 Virginia 00 \n", "WV 54 01779805 0400000US54 54 West Virginia 00 \n", "\n", " ALAND AWATER \\\n", "STUSPS \n", "MS 121533519481 3926919758 \n", "NC 125923656064 13466071395 \n", "OK 177662925723 3374587997 \n", "VA 102257717110 8528531774 \n", "WV 62266474513 489028543 \n", "\n", " geometry \n", "STUSPS \n", "MS MULTIPOLYGON (((-88.50297 30.21523, -88.49176 ... \n", "NC MULTIPOLYGON (((-75.72681 35.93584, -75.71827 ... \n", "OK POLYGON ((-103.00257 36.52659, -103.00219 36.6... \n", "VA MULTIPOLYGON (((-75.74241 37.80835, -75.74151 ... \n", "WV POLYGON ((-82.64320 38.16909, -82.64300 38.169... " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load in the geodataframe and assign it to df\n", "df = load_states()\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Merging the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To successfully create a choropleth map of the cattle data, we need to merge it with the geodataframe that contains all the information for creating the plots of the states. We can do this with the `pandas` `merge` function. Since the index for the cattle data is `State` and our geodataframe contains a similar column (`NAME`), we can use this value to merge both dataframes. Let's start by renaming the `NAME` column in our geodataframe to `State` so that the names match." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
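<table border='1' class='dataframe'><thead>
<tr><th></th><th>STATEFP</th><th>STATENS</th><th>AFFGEOID</th><th>GEOID</th><th>State</th><th>LSAD</th><th>ALAND</th><th>AWATER</th><th>geometry</th></tr>
<tr><th>STUSPS</th><th colspan='9'></th></tr>
</thead><tbody>
<tr><th>MS</th><td>28</td><td>01779790</td><td>0400000US28</td><td>28</td><td>Mississippi</td><td>00</td><td>121533519481</td><td>3926919758</td><td>MULTIPOLYGON (((-88.50297 30.21523, -88.49176 ...</td></tr>
<tr><th>NC</th><td>37</td><td>01027616</td><td>0400000US37</td><td>37</td><td>North Carolina</td><td>00</td><td>125923656064</td><td>13466071395</td><td>MULTIPOLYGON (((-75.72681 35.93584, -75.71827 ...</td></tr>
<tr><th>OK</th><td>40</td><td>01102857</td><td>0400000US40</td><td>40</td><td>Oklahoma</td><td>00</td><td>177662925723</td><td>3374587997</td><td>POLYGON ((-103.00257 36.52659, -103.00219 36.6...</td></tr>
<tr><th>VA</th><td>51</td><td>01779803</td><td>0400000US51</td><td>51</td><td>Virginia</td><td>00</td><td>102257717110</td><td>8528531774</td><td>MULTIPOLYGON (((-75.74241 37.80835, -75.74151 ...</td></tr>
<tr><th>WV</th><td>54</td><td>01779805</td><td>0400000US54</td><td>54</td><td>West Virginia</td><td>00</td><td>62266474513</td><td>489028543</td><td>POLYGON ((-82.64320 38.16909, -82.64300 38.169...</td></tr>
</tbody></table>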
\n", "
" ], "text/plain": [ " STATEFP STATENS AFFGEOID GEOID State LSAD \\\n", "STUSPS \n", "MS 28 01779790 0400000US28 28 Mississippi 00 \n", "NC 37 01027616 0400000US37 37 North Carolina 00 \n", "OK 40 01102857 0400000US40 40 Oklahoma 00 \n", "VA 51 01779803 0400000US51 51 Virginia 00 \n", "WV 54 01779805 0400000US54 54 West Virginia 00 \n", "\n", " ALAND AWATER \\\n", "STUSPS \n", "MS 121533519481 3926919758 \n", "NC 125923656064 13466071395 \n", "OK 177662925723 3374587997 \n", "VA 102257717110 8528531774 \n", "WV 62266474513 489028543 \n", "\n", " geometry \n", "STUSPS \n", "MS MULTIPOLYGON (((-88.50297 30.21523, -88.49176 ... \n", "NC MULTIPOLYGON (((-75.72681 35.93584, -75.71827 ... \n", "OK POLYGON ((-103.00257 36.52659, -103.00219 36.6... \n", "VA MULTIPOLYGON (((-75.74241 37.80835, -75.74151 ... \n", "WV POLYGON ((-82.64320 38.16909, -82.64300 38.169... " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# rename the 'NAME' column in the geodataframe to 'State'\n", "geo_df = df.rename(columns={'NAME': 'State'})\n", "geo_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Important:** To make sure that we do not accidentally lose any data during the merge, we include the `how='outer'` parameter in the merge statement. An outer merge keeps every row from both dataframes, even when a state appears in only one of them." ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
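<table border='1' class='dataframe'><thead>
<tr><th></th><th>State</th><th>Cattle</th><th>Population</th><th>Ratio</th><th>STATEFP</th><th>STATENS</th><th>AFFGEOID</th><th>GEOID</th><th>LSAD</th><th>ALAND</th><th>AWATER</th><th>geometry</th></tr>
</thead><tbody>
<tr><th>0</th><td>Alabama</td><td>1260000.0</td><td>5039877.0</td><td>0.250</td><td>01</td><td>01779775</td><td>0400000US01</td><td>01</td><td>00</td><td>131174048583</td><td>4593327154</td><td>MULTIPOLYGON (((-88.05338 30.50699, -88.05109 ...</td></tr>
<tr><th>1</th><td>Alaska</td><td>18000.0</td><td>732673.0</td><td>0.025</td><td>02</td><td>01785533</td><td>0400000US02</td><td>02</td><td>00</td><td>1478839695958</td><td>245481577452</td><td>MULTIPOLYGON (((179.48246 51.98283, 179.48656 ...</td></tr>
<tr><th>2</th><td>Arizona</td><td>960000.0</td><td>7276316.0</td><td>0.132</td><td>04</td><td>01779777</td><td>0400000US04</td><td>04</td><td>00</td><td>294198551143</td><td>1027337603</td><td>POLYGON ((-114.81629 32.50804, -114.81432 32.5...</td></tr>
<tr><th>3</th><td>Arkansas</td><td>1690000.0</td><td>3025891.0</td><td>0.559</td><td>05</td><td>00068085</td><td>0400000US05</td><td>05</td><td>00</td><td>134768872727</td><td>2962859592</td><td>POLYGON ((-94.61783 36.49941, -94.61765 36.499...</td></tr>
<tr><th>4</th><td>California</td><td>5200000.0</td><td>39237836.0</td><td>0.133</td><td>06</td><td>01779778</td><td>0400000US06</td><td>06</td><td>00</td><td>403503931312</td><td>20463871877</td><td>MULTIPOLYGON (((-118.60442 33.47855, -118.5987...</td></tr>
</tbody></table>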
\n", "
" ], "text/plain": [ " State Cattle Population Ratio STATEFP STATENS AFFGEOID \\\n", "0 Alabama 1260000.0 5039877.0 0.250 01 01779775 0400000US01 \n", "1 Alaska 18000.0 732673.0 0.025 02 01785533 0400000US02 \n", "2 Arizona 960000.0 7276316.0 0.132 04 01779777 0400000US04 \n", "3 Arkansas 1690000.0 3025891.0 0.559 05 00068085 0400000US05 \n", "4 California 5200000.0 39237836.0 0.133 06 01779778 0400000US06 \n", "\n", " GEOID LSAD ALAND AWATER \\\n", "0 01 00 131174048583 4593327154 \n", "1 02 00 1478839695958 245481577452 \n", "2 04 00 294198551143 1027337603 \n", "3 05 00 134768872727 2962859592 \n", "4 06 00 403503931312 20463871877 \n", "\n", " geometry \n", "0 MULTIPOLYGON (((-88.05338 30.50699, -88.05109 ... \n", "1 MULTIPOLYGON (((179.48246 51.98283, 179.48656 ... \n", "2 POLYGON ((-114.81629 32.50804, -114.81432 32.5... \n", "3 POLYGON ((-94.61783 36.49941, -94.61765 36.499... \n", "4 MULTIPOLYGON (((-118.60442 33.47855, -118.5987... " ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = pd.merge(final_df, geo_df, on='State', how='outer')\n", "data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To plot the data we need to use the `plot_states` function in the geostates package." 
] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# import the plot_states() function from geostates\n", "from geostates.plot import plot_states" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "# create a choropleth map that displays the cattle to person ratio for each state in the United States\n", "# plot = plot_states(data_2, column='Ratio', cmap=new_cmap, labels='both', linestyle='none', legend='colorbar',\n", " #bins=15)\n", "\n", "# add a title to the plot\n", "# plot.annotate('Cattle to Person Ratio 2022', xy=(-97, 50.5), fontsize=18, ha='center');" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }